5 research outputs found
Unsupervised Learning Facial Parameter Regressor for Action Unit Intensity Estimation via Differentiable Renderer
Facial action unit (AU) intensity is an index describing all visually
discernible facial movements. Most existing methods learn an intensity estimator
from limited AU data and therefore generalize poorly beyond the dataset they
were trained on. In this paper, we present a framework to predict the facial parameters
(including identity parameters and AU parameters) based on a bone-driven face
model (BDFM) under different views. The proposed framework consists of a
feature extractor, a generator, and a facial parameter regressor. The regressor
can fit the physically meaningful parameters of the BDFM from a single face image
with the help of the generator, which acts as a differentiable renderer by
mapping the facial parameters to game-face images. In addition, an identity loss,
a loopback loss, and an adversarial loss can improve the regression results.
Quantitative evaluations on two public databases, BP4D and DISFA,
demonstrate that the proposed method can achieve comparable or better
performance than state-of-the-art methods. The qualitative
results also demonstrate the validity of our method in the wild.
mEBAL: A Multimodal Database for Eye Blink Detection and Attention Level Estimation
This work presents mEBAL, a multimodal database for eye blink detection and
attention level estimation. Eye blink frequency is related to cognitive
activity, and automatic eye blink detectors have been proposed for many
tasks including attention level estimation, analysis of neuro-degenerative
diseases, deception recognition, driver fatigue detection, and face
anti-spoofing. However, most existing databases and algorithms in this area are
limited to experiments involving only a few hundred samples and individual
sensors like face cameras. The proposed mEBAL improves on previous databases in
terms of acquisition sensors and samples. In particular, three different
sensors are simultaneously considered: Near Infrared (NIR) and RGB cameras to
capture the face gestures and an Electroencephalography (EEG) band to capture
the cognitive activity of the user and blinking events. Regarding the size of
mEBAL, it comprises 6,000 samples and the corresponding attention level from 38
different students while conducting a number of e-learning tasks of varying
difficulty. In addition to presenting mEBAL, we also include preliminary
experiments on: i) eye blink detection using Convolutional Neural Networks
(CNN) with the facial images, and ii) attention level estimation of the
students based on their eye blink frequency.
MEBAL: A multimodal database for eye blink detection and attention level estimation
© ACM 2020. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ICMI '20 Companion, October 25–29, 2020, Virtual Event, Netherlands, https://doi.org/10.1145/3395035.3425257
This work has been supported by projects: PRIMA (ITN-2019-860315), TRESPASS-ETN (ITN-2019-860813), IDEA-FAST (IMI2-2018-15-two-stage-853981), BIBECA (RTI2018-101248-B-I00 MINECOFEDER), and edBB (Universidad Autonoma de Madrid). Ruben Tolosana and postdoc support from CAM/FEDER. Roberto Daza is supported by a PhD FPI fellowship from MINECO-FEDE
AVEC 2018 workshop and challenge: bipolar disorder and cross-cultural affect recognition
The Audio/Visual Emotion Challenge and Workshop (AVEC 2018) "Bipolar disorder, and cross-cultural affect recognition" is the eighth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing and to bring together the health and emotion recognition communities, as well as the audiovisual processing communities, to compare the relative merits of various approaches to health and emotion recognition from real-life data. This paper presents the major novelties introduced this year, the challenge guidelines, the data used, and the performance of the baseline systems on the three proposed tasks: bipolar disorder classification, cross-cultural dimensional emotion recognition, and emotional label generation from individual ratings, respectively.
MMGCN: Multimodal Graph Convolution Network for Personalized Recommendation of Micro-video
DOI: 10.1145/3343031.3351034; ACM MM 2019; pages 1437-144